Incorporating Background Knowledge into Text Classification
Abstract
It has been shown that prior knowledge and information are organized according to categories, and that background knowledge also plays an important role in classification. The purpose of this study is, first, to investigate the relationship between background knowledge and text classification and, second, to incorporate this relationship into a computational model. Our behavioral results demonstrate that participants with access to background knowledge (experts) performed significantly better overall than those without access to this knowledge (novices). More importantly, we show that experts rely more on relational features than on surface features, an aspect that bag-of-words methods fail to capture. We then propose a computational model for text classification that incorporates background knowledge. This model is built upon vector-based representation methods and achieves significantly more accurate results than the other models that were tested.
Similar resources
Integrating Background Knowledge Into Text Classification
We present a description of three different algorithms that use background knowledge to improve text classifiers. One uses the background knowledge as an index into the set of training examples. The second method uses background knowledge to reexpress the training examples. The last method treats pieces of background knowledge as unlabeled examples, and actually classifies them. The choice of b...
Integrating Background Knowledge into Nearest-Neighbor Text Classification
This paper describes two different approaches for incorporating background knowledge into nearest-neighbor text classification. Our first approach uses background text to assess the similarity between training and test documents rather than assessing their similarity directly. The second method redescribes examples using Latent Semantic Indexing on the background knowledge, assessing document similari...
Incorporating Background Knowledge into Text Categorization for Improved Sentiment Analysis
In many applications, we not only have training examples that enable supervised learning, but we also have background knowledge about the task. Traditionally, supervised learning algorithms tend to focus only on learning from the training examples. However, ignoring the background knowledge, in lieu of training data, is sub-optimal. As an alternative in the realm of text classification, we pro...
Content-based Text Categorization using Wikitology
The process of text categorization assigns labels or categories to each text document according to the semantic content of the document. The traditional approaches to text categorization used features from the text, such as words, phrases, and concept hierarchies, to represent and reduce the dimensionality of the documents. Recently, researchers addressed this brittleness by incorporating backgrou...
Incorporating Word Embeddings into Open Directory Project based Large-scale Classification
Recently, implicit representation models, such as embedding or deep learning, have been successfully adopted for text classification tasks due to their outstanding performance. However, these approaches are limited to small- or moderate-scale text classification. Explicit representation models are often used in large-scale text classification, like the Open Directory Project (ODP)-based text clas...
Publication date: 2015